Smart Paper Notes

Training for the Future: A Simple Gradient Interpolation Loss to Generalize Along Time

Anshul Nasery , Soumyadeep Thakur , Vihari Piratla , Abir De , Sunita Sarawagi

Categories: Machine Learning | Machine Learning (Statistics)

2021-08-15

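In several real-world applications, machine learning models are deployed to make predictions on data whose distribution changes gradually over time, leading to a drift between the training and test distributions. Such models are often retrained periodically on new data, so they need to generalize to data from the future. In this context, there is much prior work on improving temporal generalization, for example continuous transportation of past data, kernel smoothing of time-sensitive parameters, and, more recently, learning time-invariant features. However, these methods share several limitations, such as poor scalability, training instability, and dependence on unlabeled data from the future. In response to these limitations, we propose a simple method that starts with time-sensitive parameters but regularizes their temporal complexity using a gradient interpolation (GI) loss. GI allows the decision boundary to change along time while still preventing overfitting to the limited training-time snapshots by constraining those time-specific changes. We compare our method with existing baselines on multiple real-world datasets, which shows that GI outperforms more complicated generative and adversarial approaches on the one hand, and simpler gradient regularization methods on the other.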


Security and Interpretability in Automotive Systems

Shailja Thakur

Categories: Artificial Intelligence

2022-12-23

The lack of any sender authentication mechanism makes CAN (Controller Area Network) vulnerable to security threats. For instance, an attacker can impersonate an ECU (Electronic Control Unit) on the bus and send spoofed messages unobtrusively with the identifier of the impersonated ECU. To address the insecure nature of the system, this thesis demonstrates a sender authentication technique that uses power consumption measurements of the ECUs and a classification model to determine the transmitting states of the ECUs. The method's evaluation in real-world settings shows that the technique applies in a broad range of operating conditions and achieves good accuracy. A key challenge of machine learning-based security controls is the potential for false positives. A false-positive alert may induce panic in operators, lead to incorrect reactions, and in the long run cause alarm fatigue. For reliable decision-making in such circumstances, knowing the cause of unusual model behavior is essential, but the black-box nature of these models makes them uninterpretable. Therefore, another contribution of this thesis explores explanation techniques for image and time-series inputs that (1) assign weights to individual inputs based on their sensitivity toward the target class, and (2) quantify the variations in the explanation by reconstructing the sensitive regions of the inputs using a generative model. In summary, this thesis (https://uwspace.uwaterloo.ca/handle/10012/18134) presents methods for addressing security and interpretability in automotive systems, which can also be applied in other settings where safe, transparent, and reliable decision-making is crucial.


Multimodal and Explainable Internet Meme Classification

Abhinav Kumar Thakur , Filip Ilievski , Hông-Ân Sandlin , Alain Mermoud , Zhivar Sourati , Luca Luceri , Riccardo Tommasini

Categories: Artificial Intelligence | Natural Language Processing | Machine Learning

2022-12-11

Warning: this paper contains content that may be offensive or upsetting. In the current context where online platforms have been effectively weaponized in a variety of geo-political events and social issues, Internet memes make fair content moderation at scale even more difficult. Existing work on meme classification and tracking has focused on black-box methods that do not explicitly consider the semantics of the memes or the context of their creation. In this paper, we pursue a modular and explainable architecture for Internet meme understanding. We design and implement multimodal classification methods that perform example- and prototype-based reasoning over training cases, while leveraging both textual and visual SOTA models to represent the individual cases. We study the relevance of our modular and explainable models in detecting harmful memes on two existing tasks: Hate Speech Detection and Misogyny Classification. We compare the performance between example- and prototype-based methods, and between text, vision, and multimodal models, across different categories of harmfulness (e.g., stereotype and objectification). We devise a user-friendly interface that facilitates the comparative analysis of examples retrieved by all of our models for any given meme, informing the community about the strengths and limitations of these explainable methods.


Adversarial De-confounding in Individualised Treatment Effects Estimation

Vinod Kumar Chauhan , Soheila Molaei , Marzia Hoque Tania , Anshul Thakur , Tingting Zhu , David Clifton

Categories: Machine Learning | Artificial Intelligence

2022-10-19

Observational studies have recently received significant attention from the machine learning community because non-experimental observational data are increasingly available and experimental studies have limitations such as considerable cost, impracticality, and small, less representative samples. In observational studies, de-confounding is a fundamental problem in individualised treatment effects (ITE) estimation. This paper proposes disentangled representations with adversarial training to selectively balance the confounders in the binary treatment setting for ITE estimation. The adversarial training of the treatment policy selectively encourages treatment-agnostic balanced representations for the confounders and helps to estimate the ITE in observational studies via counterfactual inference. Empirical results on synthetic and real-world datasets, with varying degrees of confounding, show that the proposed approach improves on state-of-the-art methods, achieving lower error in ITE estimation.


SEER: Safe Efficient Exploration for Aerial Robots using Learning to Predict Information Gain

Yuezhan Tao , Yuwei Wu , Beiming Li , Fernando Cladera , Alex Zhou , Dinesh Thakur , Vijay Kumar

Categories: Robotics

2022-09-22

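We address the problem of efficient 3-D exploration in indoor environments for micro aerial vehicles with limited sensing capabilities and payload/power constraints. We develop an indoor exploration framework that uses learning to predict the occupancy of unseen areas, extracts semantic features, samples viewpoints to predict the information gain for different exploration goals, and plans informative trajectories to enable safe and smart exploration. Extensive experiments in simulated and real-world environments show that the proposed approach outperforms the state-of-the-art exploration framework in terms of total path length in a structured indoor environment, with a higher success rate during exploration.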


Theroretical Insight into Batch Normalization: Data Dependant Auto-Tuning of Regularization Rate

Lakshmi Annamalai , Chetan Singh Thakur

Categories: Machine Learning (Statistics) | Machine Learning

2022-09-15

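Batch normalization is widely used in deep learning to normalize intermediate activations. Deep networks are notoriously harder to train: they demand careful weight initialization, require lower learning rates, and so on. Batch normalization (BN) addresses these issues by normalizing the inputs of activations to zero mean and unit standard deviation. Making batch normalization part of the training process significantly accelerates the training of very deep networks. A new line of research examines the exact theoretical explanation behind the success of BN. Most of these theoretical insights attempt to explain the benefits of BN through its influence on optimization, weight-scale invariance, and regularization. Despite BN's undeniable success in accelerating generalization, an analysis relating the effect of BN to the regularization parameter is still missing. This paper aims to bring out the data-dependent auto-tuning of the regularization parameter by BN, with analytical proofs. We pose BN as a constrained optimization over the non-BN weights, through which we demonstrate its data-statistics-dependent auto-tuning of the regularization parameter. We also give an analytical proof of its behavior under a noisy-input scenario, which reveals the signal-versus-noise tuning of the regularization parameter. We also substantiate our claims with empirical results from experiments on the MNIST dataset.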
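As a minimal sketch of the normalization step this abstract describes (standardizing each activation input to zero mean and unit standard deviation over a batch), the plain-Python function below may help. The function name `batch_norm`, the `eps` stabilizer, and the sample data are illustrative assumptions and do not come from the paper; the learnable scale and shift that BN layers normally apply, and the paper's regularization analysis, are left out.

```python
import math

def batch_norm(batch, eps=1e-5):
    """Standardize each feature column of `batch` (a list of rows).

    Hypothetical helper for illustration only: it omits the learnable
    scale/shift parameters and the running statistics used at inference.
    """
    n = len(batch)
    n_features = len(batch[0])
    # Per-feature mean over the batch dimension.
    means = [sum(row[j] for row in batch) / n for j in range(n_features)]
    # Per-feature (biased) variance over the batch dimension.
    variances = [
        sum((row[j] - means[j]) ** 2 for row in batch) / n
        for j in range(n_features)
    ]
    # Subtract the mean and divide by the standard deviation;
    # eps (an assumed small constant) avoids division by zero.
    return [
        [(row[j] - means[j]) / math.sqrt(variances[j] + eps)
         for j in range(n_features)]
        for row in batch
    ]

# Made-up activations: 3 samples, 2 features on very different scales.
activations = [[1.0, 10.0], [2.0, 20.0], [3.0, 30.0]]
normalized = batch_norm(activations)
```

After the call, both columns of `normalized` have mean close to zero and variance close to one, regardless of their original scale.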


Offline Handwritten Mathematical Recognition using Adversarial Learning and Transformers

Ujjwal Thakur , Anuj Sharma

Categories: Computer Vision

2022-08-20

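Offline handwritten mathematical expression recognition (HMER) is a major area in the field of mathematical expression recognition. Compared with online HMER, offline HMER is often considered a harder problem because of the lack of temporal information and the variability in writing styles. In this paper, we propose an encoder-decoder model that uses paired adversarial learning. Semantic-invariant features are extracted in the encoder from handwritten mathematical expression images and their printed counterparts. Learning semantic-invariant features, combined with a DenseNet encoder and a transformer decoder, helped us improve the expression recognition rate over previous studies. Evaluated on the CROHME dataset, our model improves the latest CROHME 2019 test set results by 4%.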


Multi-fidelity wavelet neural operator with application to uncertainty quantification

Akshay Thakur , Tapas Tripura , Souvik Chakraborty

Categories: Machine Learning

2022-08-11

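Operator learning frameworks, because of their ability to learn nonlinear maps between two infinite-dimensional function spaces using neural networks, have recently emerged as one of the most pertinent areas in applied machine learning. Although these frameworks are extremely capable of modeling complex phenomena, they require large amounts of data for successful training, and such data are often unavailable or too expensive. This issue can be alleviated with multi-fidelity learning, in which a model is trained on a large amount of inexpensive low-fidelity data together with a small amount of expensive high-fidelity data. To this end, we develop a new framework based on the wavelet neural operator that is capable of learning from multi-fidelity datasets. The excellent learning capability of the developed model is demonstrated by solving different problems that require effective correlation learning between the two fidelities. Furthermore, we assess the application of the developed framework to uncertainty quantification. The results obtained from this work illustrate the excellent performance of the proposed framework.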


COPER: Continuous Patient State Perceiver

Vinod Kumar Chauhan , Anshul Thakur , Odhran O'Donoghue , David A. Clifton

Categories: Machine Learning | Artificial Intelligence

2022-08-05

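In electronic health records (EHRs), irregular time series (ITS) occur naturally because of patient health dynamics, reflected in irregular hospital visits, diseases/conditions, and the need to measure different vital signs at each visit. ITS present challenges for training machine learning algorithms, which are mostly built on the assumption of a coherent, fixed-dimensional feature space. In this paper, we propose a novel continuous patient state perceiver model, called COPER, to cope with ITS in EHRs. COPER uses the Perceiver model and the concept of neural ordinary differential equations (ODEs) to learn the continuous-time dynamics of the patient state, that is, continuity of the input space and continuity of the output space. The neural ODEs help COPER generate regular time series to feed to the Perceiver model, which can handle multimodal, large-scale inputs. To evaluate the performance of the proposed model, we use the in-hospital mortality prediction task on the MIMIC-III dataset and carefully design experiments to study irregularity. The results are compared with baselines, which demonstrates the efficacy of the proposed model.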


Deep VULMAN: A Deep Reinforcement Learning-Enabled Cyber Vulnerability Management Framework

Soumyadeep Hore , Ankit Shah , Nathaniel D. Bastian

Categories: Artificial Intelligence | Neural and Evolutionary Computing

2022-08-03

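Cyber vulnerability management is a critical function of a cybersecurity operations center (CSOC) that helps protect organizations against cyber-attacks on their computer and network systems. Adversaries hold an asymmetric advantage over the CSOC, because the number of deficiencies in these systems is growing at a significantly higher rate than security teams can expand to mitigate them in a resource-constrained environment. Current approaches are deterministic, one-time decision-making methods that do not consider future uncertainties when prioritizing and selecting vulnerabilities for mitigation. These approaches are also constrained by a sub-optimal distribution of resources and offer no flexibility to adjust their response to fluctuations in vulnerability arrivals. We propose a novel framework, Deep VULMAN, consisting of a deep reinforcement learning agent and an integer programming method, to fill this gap in the cyber vulnerability management process. Our sequential decision-making framework first determines the near-optimal amount of resources to allocate for mitigation under uncertainty for a given system state, and then determines the optimal set of prioritized vulnerability instances for mitigation. Our proposed framework outperforms current methods in prioritizing the selection of important organization-specific vulnerabilities, on both simulated and real-world vulnerability data observed over a one-year period.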
